[Costs] Framework #913
Conversation
I can't reproduce the failure locally. Is this a Python 3.10 vs 3.11 thing?
@tanujkhattar ptal
LGTM! Looks pretty good!
# will be cached.
def _get_cost_val_internal(callee: 'Bloq'):
    return _get_cost_value(callee, cost_key, costs_cache=costs_cache, generalizer=generalizer)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe add a docstring for part (b)? It's a little awkward as a reader to see part (a) and then nothing else.
whoops done
This function can be used to annotate a call graph diagram with multiple costs
for each bloq. Specifically, the return value of this function can be used as the
`bloq_data` argument to `GraphvizCallGraph`.
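As a rough sketch of that usage (the query function and cost-key names below, `query_costs`, `QubitCount`, and `QECGatesCost`, plus `MyBloq`, are assumptions for illustration and may not match the merged API exactly):

    from qualtran.drawing import GraphvizCallGraph
    from qualtran.resource_counting import query_costs, QubitCount, QECGatesCost

    bloq = MyBloq()  # hypothetical bloq under study
    # Mapping of {bloq: {cost_key: value}} suitable for annotating a diagram.
    cost_data = query_costs(bloq, [QubitCount(), QECGatesCost()])
    call_graph, _ = bloq.call_graph()
    drawing = GraphvizCallGraph(call_graph, bloq_data=cost_data)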
My only concern here is that this function assumes an implicit "call graph" where the leaf nodes are the ones which have `my_static_costs` implemented. Does it make sense to change the API to accept either (a) an `nx.DiGraph` returned by calling `bloq.call_graph()`, or (b) `*bloqs` instead of `bloq`? That way one can query costs for a call graph where intermediate nodes may have explicit annotations on them, but there's still value in visualizing the costs of their callees.
I see: you want to force inclusion of bloqs that would not be traversed in the actual computation of the cost. This can always be added as a new function (the body is a simple wrapper over the other functions), but I'd be wary of situations where you'd want to show callee costs that don't actually factor into their parents' (callers') computation.
can optionally provide a value, which will be preferred over a computed cost.

Static costs can be provided if the particular cost cannot be easily computed or
as a performance optimization.
One issue here is that the `my_static_costs` method does not have access to the costs cache. So if it were to do any non-trivial computation where it needs to express its cost in terms of the costs of its callees, it wouldn't be able to pass the cache on to the `get_cost_value` method.

We faced this issue with the `TComplexity` protocol, and that's why we ended up using a global cache. In this framework, should we maybe accept a cache dictionary as another argument to this method, since the call site already has it?
Ah, the intent here is for `my_static_costs` to be used rarely and only for truly static costs that you can just declare. Consider it the generic analog to the `_t_complexity_` override, where you're really only allowed to fill in numbers.

The cache is available to the `compute` method, which is analogous to every strategy except the `t_complexity._from_explicit_annotation` strategy in the t-complexity protocol.

I'll update some docstrings to clarify the expected use of `my_static_costs`.
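To illustrate the intended division of labor (signatures here are assumed from this thread and may differ from the merged code; `QubitCount`, `MyLeafBloq`, and `callees_of` are illustrative names):

    from qualtran import Bloq
    from qualtran.resource_counting import CostKey  # assumed import path

    class QubitCount(CostKey):
        def compute(self, bloq: 'Bloq', get_callee_cost):
            # Recursive / non-trivial logic lives here; `get_callee_cost` goes
            # through the shared costs cache, so callee costs are memoized.
            return max(get_callee_cost(callee) for callee in callees_of(bloq))  # callees_of is illustrative

    class MyLeafBloq(Bloq):
        def my_static_costs(self, cost_key: 'CostKey'):
            # Truly static, declarable numbers only -- the generic analog of
            # the old `_t_complexity_` override.
            if isinstance(cost_key, QubitCount):
                return 3
            return NotImplemented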
I understand that, but I think there's value in supporting the use cases where `my_static_costs` needs to do something more complex than specifying just numbers.

An example would be supporting the cost key for circuit `depth`. The `compute` method would correspond to calling `bloq.decompose_bloq()` and then (one strategy would be to) finding the longest path in a weighted DAG where the weights are given by the depth of each subbloq. This will take time at least O(N) if the decomposition has N nodes (unlike other costs like T-complexity, which scale as O(M), where M is the number of unique callees and often M << N).

Now, if we stick to the strategy where `my_static_costs` specifies just numbers and `compute` works as described above, the depth computation would be prohibitively slow for a wide variety of bloqs whose decompositions result in a large number of nodes. Some concrete examples are bloqs based on unary iteration, arithmetic circuits like adders, etc.

In all these examples, as bloq authors we know the circuit structure very well and can easily annotate `my_static_costs`. But it's not enough to specify "a number"; we need `my_static_costs` to be an expression in terms of the costs of its callees. For unary-iteration-based bloqs, `my_static_costs` would be something like `c * N + sum(get_bloq_cost(bloq_for_nth_operation(n)) for n in range(self.selection_range))`, i.e. a constant for the unary iteration tree plus the depth of each controlled operation. For arithmetic circuits, we'd likely want to specify costs as a function of `get_cost(Toffoli)`, which users can pin down by providing the cost dictionary, since a Toffoli has multiple different decompositions optimizing depth vs. T-counts etc., and users can choose which decomposition should be used by providing a cost for the depth of `Toffoli`.

Tl;dr: I think it's restrictive to assume that `my_static_costs` would only ever specify truly static costs, and I think there's value in making it extensible to support the use case where it will recursively need and use the costs of its callees.
The idea is to group any of this logic in the `compute` method, which can dispatch based on the type of bloq (analogous to how we provide costs for leaf bloqs in the existing t_complexity protocol).

We should also aim to reduce the number of bloqs where doing an O(N) operation is considered prohibitively costly. In your example, you still need to iterate over `self.selection_range` bloqs. I guess the presumption is that the constant term absorbs a number of bit-twiddling subbloqs that meaningfully exceeds the selection_range, but we should probably drive our efforts towards encoding the "simple" structure of unary iteration in the bloq decomposition hierarchy.
> The idea is to group any of this logic in the `compute` method, which can dispatch based on the type of bloq (analogous to how we provide costs for leaf bloqs in the existing t_complexity protocol).

Yeah, I think that would just not be feasible for the cost types that need to look at the decomposed composite bloq instead of just the bloq counts. I've raised this concern before, and I provide a very concrete example above.

> In your example, you still need to iterate over `self.selection_range` bloqs. I guess the presumption is that the constant term absorbs a number of bit-twiddling subbloqs that meaningfully exceeds the selection_range

For this specific example, the presumption is that building the composite bloq can be much slower than iterating over the callees, even if both take O(N) time. #609 (comment) was an example of how much of a difference this can make.

> We should also aim to reduce the number of bloqs where doing an O(N) operation is considered prohibitively costly

Well, that's a very optimistic goal, and I don't think it aligns well with the rest of the philosophy of Qualtran. One of the reasons we have the concept of a call graph is that we think there will be cases where constructing a composite bloq in O(N) time would be prohibitively more expensive than just specifying a list of callees and their bloq counts.

The same logic applies to costs that depend upon the structure of the decomposed composite bloq, like `depth`. There will be cases where computing the composite bloq and then processing it to figure out the cost would be prohibitively more expensive than writing an expression for the cost in terms of the callees. For costs like T-complexity, this "expression" is simple and can be generalized and put as part of the `compute` method, but for costs like `depth` this "expression" needs to live inside the bloq. That place can be `my_static_costs` (or we can rename it to be more descriptive for this use case). But as the design stands right now, we don't support this flow at all, where users can write the expression for computing a cost for a bloq (in terms of its callees) as part of the bloq itself.

For some more concrete numbers, here are the timings to construct a composite bloq for a 1000- and a 5000-bit adder. You can see this is already getting very expensive, and these are sizes we encounter often.

So now, if a user were to compute the cost of an adder for `depth`, they would either not override `my_static_costs` and we'd pay the time penalty of calling `adder.decompose_bloq()` (which is huge), OR they would need to encode an expression that hardcodes the depth of subbloqs like `And` gates, which isn't ideal and is similar to what we were doing in T-complexity and got rid of by introducing the call graph hierarchy. The ideal solution here would be to allow users to specify things like `adder_depth = c1 * cost(And()) + c2 * cost(And().adjoint()) + c3 * cost(CNOT())`, where `c1`, `c2`, and `c3` are constants known to the bloq authors because they know the structure of the decomposition. A function similar to `my_static_costs` can be used for such annotations.
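For the adder case, the annotation being asked for might look roughly like this (hypothetical: `CircuitDepth`, the `get_cost` argument, and the constants are placeholders; the point is only that the value is an expression over callee costs rather than a bare number):

    class Add(Bloq):
        def my_static_costs(self, cost_key: 'CostKey', get_cost):
            if isinstance(cost_key, CircuitDepth):
                # c1, c2, c3 are constants the author knows from the structure
                # of the adder's decomposition (placeholder values here).
                c1 = c2 = c3 = self.bitsize
                return c1 * get_cost(And()) + c2 * get_cost(And().adjoint()) + c3 * get_cost(CNOT())
            return NotImplemented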
You can put that logic in the `compute` method:

    def compute(self, bloq):
        if isinstance(bloq, Add):
            return c1 * cost(And())
        return sum(depth(b) for b in bloq.decompose())

Remember: the extensibility axis is to support adding more costs. This is particularly acute for circuit depth, which isn't the most important cost for certain architectures.
> You can put that logic in the `compute` method.

This means every time a bloq author writes a bloq that is slow to decompose for large instances, they would have to update the `compute` method of the cost key. This does not scale at all (unless we assume all bloqs are authored by us and we maintain a central repository of these if/else overloads in the compute method of the cost key).

> Remember: the extensibility axis is to support adding more costs. This is particularly acute for circuit depth, which isn't the most important cost for certain architectures.

The argument you've given above completely ignores the ease of extensibility for more costs. Since the PR is already merged now, I'll open a new issue to track this, so that whenever we end up implementing a cost like `depth` that depends upon the structure of the decomposition, we can revisit this discussion.
@@ -0,0 +1,32 @@
# Copyright 2023 Google LLC
Update 2023 to 2024 everywhere.
gotta update my pycharm setting that puts these in automatically :)
Introduces `CostKey`, which can be overridden to define a "cost" (type). Part of #899.

Actual implementations of costs to follow.

For something like the existing t-complexity protocol: the recursive computation would be wrapped in a `CostKey` implementation, and the bespoke `_t_complexity_` override would move to `my_static_costs`. We use the `NotImplemented` return value instead of using `getattr` to see if the method itself exists or not.
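A small sketch of that `NotImplemented` convention (assumed from this description, not verbatim from the PR):

    class Bloq:
        def my_static_costs(self, cost_key: 'CostKey'):
            # Default: no static value is declared for any cost key.
            return NotImplemented

    # Inside the cost computation, check for the sentinel rather than probing
    # with getattr:
    static = bloq.my_static_costs(cost_key)
    if static is not NotImplemented:
        result = static                                   # a declared value wins
    else:
        result = cost_key.compute(bloq, get_callee_cost)  # otherwise compute recursively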